[BugFix] Fix default kv-cache-dtype default for DeepseekV3.2 #25988
Signed-off-by: Lucas Wilkinson <[email protected]>
Code Review
This pull request addresses a bug where the specific configuration for DeepseekV3.2 models was not being applied. The changes involve renaming the configuration class to DeepseekV32ForCausalLM and updating the model configuration map accordingly. The logic for applying the custom KV cache settings has also been simplified. While the changes are generally good, I've identified a potential issue in the handling of the bfloat16 cache data type that could lead to unexpected behavior.
Signed-off-by: Lucas Wilkinson <[email protected]> Signed-off-by: simon-mo <[email protected]>
…oject#25988) Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]> Signed-off-by: yewentao256 <[email protected]>
…oject#25988) Signed-off-by: Lucas Wilkinson <[email protected]> Signed-off-by: Tomer Asida <[email protected]>
…oject#25988) Signed-off-by: Lucas Wilkinson <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>
…oject#25988) Signed-off-by: Lucas Wilkinson <[email protected]> Signed-off-by: simon-mo <[email protected]>
In the final release, DeepseekV32 was registered separately, so the config override was no longer being picked up.
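
For readers unfamiliar with this failure mode, here is a minimal sketch of how an override keyed by architecture name can silently stop applying. It is not the code from this PR: `apply_overrides`, `DeepseekV32Override`, the `HypotheticalOldName` key, and the `model-specific-default` value are all hypothetical names chosen for illustration. Only the `DeepseekV32ForCausalLM` key comes from the review summary.

```python
# Hypothetical sketch; helper names and values are illustrative,
# not taken from the PR diff.


class DeepseekV32Override:
    """Stand-in for a model-specific config hook."""

    @staticmethod
    def apply(settings: dict) -> None:
        # Assumed behaviour for illustration: pick a model-specific
        # KV-cache dtype only when the user left the setting on "auto".
        if settings.get("kv_cache_dtype", "auto") == "auto":
            settings["kv_cache_dtype"] = "model-specific-default"


def apply_overrides(architecture: str, settings: dict, override_map: dict) -> dict:
    """Apply the override registered for `architecture`, if there is one."""
    override = override_map.get(architecture)
    if override is not None:
        override.apply(settings)
    return settings


# Map keyed by a name the model is not registered under: the lookup
# misses and no error is raised, so the default stays untouched.
stale_map = {"HypotheticalOldName": DeepseekV32Override}

# Map keyed by the architecture name mentioned in the review summary.
fixed_map = {"DeepseekV32ForCausalLM": DeepseekV32Override}

before = apply_overrides("DeepseekV32ForCausalLM", {"kv_cache_dtype": "auto"}, stale_map)
after = apply_overrides("DeepseekV32ForCausalLM", {"kv_cache_dtype": "auto"}, fixed_map)

print(before["kv_cache_dtype"])
print(after["kv_cache_dtype"])
```

Because a dictionary miss is not an error, a mismatch between the registered architecture name and the map key shows up only as a wrong default at runtime, which is consistent with the symptom described above.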